Hut101 19 jhwisdom by jhwisdom · Pull Request #577 · maps-as-data/MapReader

jhwisdom · 2026-02-04T18:44:19Z

Summary

This pull request adds support for Hugging Face models within the ClassifierContainer. Previously, users had to manually load Hugging Face models and feature extractors before passing them to the container. Now, passing a Hugging Face repository path and setting huggingface=True is enough: the container handles the initialization automatically. (This work is part of my participation in the hut101 opportunities.)

Fixes #192

Describe your changes

Updated ClassifierContainer.__init__: Added a huggingface boolean flag (defaulting to False).
Integrated transformers library:
- Implemented conditional loading of models using AutoModelForImageClassification.from_pretrained.
- Added ignore_mismatched_sizes=True to allow easy fine-tuning on custom labels.
Compatibility:
- Set self.is_inception = False for HF models to bypass legacy Inception-specific logic while maintaining the existing _get_logits workflow.
- Used getattr to dynamically set self.input_size from the processor's configuration, ensuring compatibility across different HF models.
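
A minimal sketch of the loading path described above, not the code in this PR. The helper names hf_load_kwargs, load_hf_classifier and get_logits are hypothetical, as is the assumption that num_labels equals the number of entries in labels_map. Only the two from_pretrained calls and the num_labels / ignore_mismatched_sizes arguments come from the diff. get_logits is an illustration of why an output object with a .logits attribute can share one code path with plain tensors; it is not MapReader's _get_logits.

```python
from __future__ import annotations

from typing import Any


def hf_load_kwargs(labels_map: dict[int, str]) -> dict[str, Any]:
    """Build keyword arguments for from_pretrained (hypothetical helper)."""
    return {
        # Assumption: one output class per entry in the labels map.
        "num_labels": len(labels_map),
        # Allow a checkpoint whose classification head has a different number
        # of classes to load; the mismatched head weights are re-initialised.
        "ignore_mismatched_sizes": True,
    }


def load_hf_classifier(repo_id: str, labels_map: dict[int, str], device: str = "cpu"):
    """Load a model and its image processor from a Hugging Face repository."""
    # Imported lazily so transformers is only needed on the Hugging Face path.
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    model = AutoModelForImageClassification.from_pretrained(
        repo_id, **hf_load_kwargs(labels_map)
    ).to(device)
    processor = AutoImageProcessor.from_pretrained(repo_id)
    return model, processor


def get_logits(out: Any) -> Any:
    """Return out.logits when present, otherwise out unchanged (hypothetical)."""
    return getattr(out, "logits", out)
```

Calling load_hf_classifier needs the transformers package and access to the named repository, so it is left uncalled here. Because the classification head is re-initialised when sizes do not match, a model loaded this way would normally be fine-tuned before use.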

Checklist before assigning a reviewer (update as needed)

Reviewer checklist

Please add anything you want reviewers to specifically focus/comment on.

Everything looks ok?


rwood-97

Hi Jihye,

I've added a few small comments.

Could you also have a go at updating the docs? This is the file you'd need to edit: https://github.com/maps-as-data/MapReader/blob/main/docs/source/using-mapreader/step-by-step-guide/4-classify/train.rst

The unit tests are failing at the moment, but this isn't your fault. I think it is a dependency issue, so I will try to fix these in a separate branch.
The only one you need to fix is the check changelog test, which checks that you've updated CHANGELOG.md as part of your PR. To fix it, you just need to add your changes to the CHANGELOG.md file.

Thanks for doing this :)

rwood-97 · 2026-02-12T08:01:58Z

mapreader/classify/classifier.py

                )

            self.labels_map = labels_map
+            self.huggingface = huggingface


I would skip setting self.huggingface, since it's only referenced further down in the init, and instead just use the huggingface value in the if statement below.

rwood-97 · 2026-02-12T08:08:10Z

mapreader/classify/classifier.py

+                        num_labels=num_labels,
+                        ignore_mismatched_sizes=True
+                    ).to(self.device)
+                    self.hf_processor = AutoImageProcessor.from_pretrained(model)


I think this could also be just hf_processor instead of an attribute self.hf_processor since it isn't used outside of this function

rwood-97 · 2026-02-12T08:10:54Z

mapreader/classify/classifier.py

        is_inception: bool = False,
        load_path: str | None = None,
        force_device: bool = False,
+        huggingface=False,


Can you add a type hint here?

Suggested change

-        huggingface=False,
+        huggingface: bool = False,

rwood-97 · 2026-02-12T08:21:40Z

mapreader/classify/classifier.py

+                        ignore_mismatched_sizes=True
+                    ).to(self.device)
+                    self.hf_processor = AutoImageProcessor.from_pretrained(model)
+                    self.input_size = getattr(self.hf_processor, "size", {"height": 224})["height"]


Looking here there seem to be 3 options for how size is defined.

Could you implement these, i.e.:

Suggested change

-                    self.input_size = getattr(self.hf_processor, "size", {"height": 224})["height"]
+                    size = getattr(hf_processor, "size", {})
+                    if "height" in size and "width" in size:
+                        self.input_size = (size["height"], size["width"])
+                    elif "shortest_edge" in size:
+                        self.input_size = (size["shortest_edge"], size["shortest_edge"])
+                    else:
+                        self.input_size = input_size
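
The three cases in this suggestion can be sketched as a standalone function, which also makes them easy to unit test. This is an illustration only: resolve_input_size and _FakeProcessor are hypothetical names, the (224, 224) default fallback is an assumption taken from the 224 in the original line, and the sketch assumes the processor's size is a plain mapping. Any other value, including None, falls back to the default.

```python
from __future__ import annotations

from collections.abc import Mapping


def resolve_input_size(
    size: object, fallback: tuple[int, int] = (224, 224)
) -> tuple[int, int]:
    """Map a processor `size` value onto a (height, width) tuple."""
    if not isinstance(size, Mapping):
        # Missing attribute, None, or an unexpected type: use the fallback.
        return fallback
    if "height" in size and "width" in size:
        return (size["height"], size["width"])
    if "shortest_edge" in size:
        edge = size["shortest_edge"]
        return (edge, edge)
    return fallback


class _FakeProcessor:
    """Stand-in for an image processor; only the `size` attribute matters."""

    size = {"shortest_edge": 256}


print(resolve_input_size(getattr(_FakeProcessor(), "size", None)))  # (256, 256)
```

Inside the constructor, the fallback would be the input_size argument, as in the suggested change.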

Jihye Jeong added 2 commits February 4, 2026 19:11

feat: add Hugging Face support to ClassifierContainer (maps-as-data#19)

2d18805

feat: fix typo and support Hugging Face models (maps-as-data#192/hut101-maps-as-data#19)

8cf2c25

jhwisdom marked this pull request as draft February 4, 2026 18:51

jhwisdom marked this pull request as ready for review February 4, 2026 18:52

jhwisdom mentioned this pull request Feb 4, 2026

MapReader: Create a function which allows us to load and safe huggingface models directly into MapReader alan-turing-institute/hut101#19

Open

rwood-97 requested changes Feb 12, 2026

jhwisdom wants to merge 2 commits into maps-as-data:main from jhwisdom:hut101-19-jhwisdom


2 participants

